In [1]:
from google.colab import drive      
drive.mount('/content/gdrive')
Mounted at /content/gdrive
In [14]:
!nvidia-smi
Tue Nov  3 14:20:49 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 455.32.00    Driver Version: 418.67       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla T4            Off  | 00000000:00:04.0 Off |                    0 |
| N/A   38C    P8     9W /  70W |      0MiB / 15079MiB |      0%      Default |
|                               |                      |                 ERR! |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

Installations

In [2]:
%%shell
# Download TorchVision repo to use some files from
# references/detection
git clone https://github.com/pytorch/vision.git
cd vision
git checkout v0.3.0

cp references/detection/utils.py ../
cp references/detection/transforms.py ../
cp references/detection/coco_eval.py ../
cp references/detection/engine.py ../
cp references/detection/coco_utils.py ../
Cloning into 'vision'...
remote: Enumerating objects: 11076, done.
remote: Total 11076 (delta 0), reused 0 (delta 0), pack-reused 11076
Receiving objects: 100% (11076/11076), 12.59 MiB | 21.88 MiB/s, done.
Resolving deltas: 100% (7723/7723), done.
Note: checking out 'v0.3.0'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:

  git checkout -b <new-branch-name>

HEAD is now at be37608 version check against PyTorch's CUDA version
Out[2]:

In [3]:
import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
from glob import glob
import cv2
import matplotlib.pyplot as plt
import json
from pprint import pprint
from tqdm import tqdm_notebook as tqdm
from PIL import Image, ImageDraw

import torchvision
from torch.utils.data import Dataset, DataLoader

from torchvision.models.detection.faster_rcnn import FastRCNNPredictor,FasterRCNN
from torchvision.models.detection.backbone_utils import resnet_fpn_backbone
from torchvision.models.detection.rpn import AnchorGenerator
from torchvision.models.detection.rpn import RPNHead
from torchvision.datasets import CocoDetection

import torch.optim as optim
from torchvision import transforms


import torch
from torch.utils.data import random_split

from torchvision.models.detection.faster_rcnn import FastRCNNPredictor
import pycocotools
import utils
import transforms as T

# Ignore warnings
import warnings
warnings.filterwarnings("ignore")
device = torch.device('cuda') if torch.cuda.is_available() else torch.device('cpu')
print('device: ', device)

PATH = '/content/gdrive/My Drive/Training_afeka_object_detection/images'
device:  cuda

Part 1: Getting the face mask data and preprocessing it!

In order to build an object detection model, data is required; without it the model won't learn to detect anything. In this notebook we build a neural network for detecting face masks. To be precise, we train a model to detect whether a person is wearing a mask as required. Since, apart from the images themselves, we had no information about the locations and sizes of the masks we want to identify, we chose to use COCO Annotator, a great and simple tagging tool that lets you tag the objects (masks) and export a JSON file with the desired label details in COCO format. Practical knowledge of this tool will allow us to train models in the future to detect any object we want.
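
For reference, the COCO-format JSON exported by the tagging tool looks schematically like this (field names follow the COCO detection format; the values below are invented for illustration):

```python
import json

# Minimal illustration of the COCO-style JSON this notebook consumes.
# Field names follow the COCO detection format; values are made up.
sample = {
    "images": [
        {"id": 56, "file_name": "maksssksksss77.png", "height": 225, "width": 400}
    ],
    "annotations": [
        # bbox is [x, y, width, height] in the COCO convention
        {"id": 1, "image_id": 56, "category_id": 1,
         "bbox": [104, 130, 32, 42], "area": 1344, "iscrowd": 0}
    ],
    "categories": [{"id": 1, "name": "mask"}],
}

# Round-trip through a string, the same way the notebook loads the real file.
data = json.loads(json.dumps(sample))
for img in data["images"]:
    print(img["file_name"], img["id"], img["height"], img["width"])
```

The notebook reads the `images` and `annotations` lists from such a file; each annotation points back at its image via `image_id`.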

In [ ]:
with open('/content/gdrive/My Drive/finished_total.json') as json_data: # Read annotations file
    data = json.load(json_data)

images = []
for dic in data['images']:
    images.append((dic['file_name'], dic['id'], dic['height'], dic['width']))

We manually extracted the annotations for each image from the file, and at the end we created a DataFrame so that we could continue the processing more easily and draw conclusions about how to set up the training.

In the JSON file each image has a list with all of its bounding boxes. In the DataFrame we separated them so that each bounding box appears in its own row.

In [ ]:
def get_dic_format(index, image_id, path, height, width, x_min, y_min, x_max, y_max, w, h, area, label):
    return {
                            'index': index,
                            'image_id': image_id,
                            'path': path,
                            'height': height,
                            'width': width,
                            'x_min': x_min,
                            'y_min': y_min,
                            'x_max': x_max,
                            'y_max': y_max,
                            'w': w,
                            'h': h,
                            'area': area,
                            'label': label               
                        }
path = PATH + '/*'
train_paths = glob(path)
new_data = []
for img in images: # Run over the images from json file
    for p in train_paths:
        name = p.split('/')[-1]
        if img[0] == name: # match the image by comparing file names
            none_bbox = False
            for dic in data['annotations']:
                if dic['image_id'] == img[1]:
                    box = dic['bbox'] # COCO bbox is [x, y, width, height]
                    # note: the x/y and w/h names are swapped here; add_bbox()
                    # below swaps them back when drawing, so plots come out right
                    y = int(box[0])
                    x = int(box[1])
                    h = int(box[2])
                    w = int(box[3])
                    if h*w != 0: # skip degenerate zero-area boxes
                        none_bbox = True
                        new_data.append(get_dic_format(img[1], img[0], 
                                                       p, img[2], img[3], 
                                                       x, y, (x+w), (y+h), 
                                                       w, h, dic['area'], 1))
            if not none_bbox:
                new_data.append(get_dic_format(img[1], img[0], 
                                               p, img[2], img[3], 
                                               0, 0, 0, 0, 0, 0, 0, 0))



df = pd.DataFrame.from_dict(new_data)
display(df)
index image_id path height width x_min y_min x_max y_max w h area label
0 56 maksssksksss77.png /content/gdrive/My Drive/Training_afeka_object... 225 400 104 130 136 172 32 42 1344 1
1 56 maksssksksss77.png /content/gdrive/My Drive/Training_afeka_object... 225 400 105 54 134 88 29 34 986 1
2 56 maksssksksss77.png /content/gdrive/My Drive/Training_afeka_object... 225 400 98 0 124 11 26 11 286 1
3 57 maksssksksss49.png /content/gdrive/My Drive/Training_afeka_object... 225 400 107 351 122 367 15 16 240 1
4 57 maksssksksss49.png /content/gdrive/My Drive/Training_afeka_object... 225 400 37 272 49 288 12 16 192 1
... ... ... ... ... ... ... ... ... ... ... ... ... ...
599 209 maksssksksss48.png /content/gdrive/My Drive/Training_afeka_object... 0 0 0 0 0 0 0 0 0 0
600 210 maksssksksss51.png /content/gdrive/My Drive/Training_afeka_object... 0 0 0 0 0 0 0 0 0 0
601 211 maksssksksss68.png /content/gdrive/My Drive/Training_afeka_object... 0 0 0 0 0 0 0 0 0 0
602 212 maksssksksss72.png /content/gdrive/My Drive/Training_afeka_object... 0 0 0 0 0 0 0 0 0 0
603 213 maksssksksss95.png /content/gdrive/My Drive/Training_afeka_object... 0 0 0 0 0 0 0 0 0 0

604 rows × 13 columns

In [ ]:
# Helper functions for the statistics part

def count_boxes(df):
    images_list = df['image_id'].unique()
    with_mask = []
    without_mask = []
    n_with_mask = 0
    total = 0
    for name in images_list:
        mini_df = df[df['image_id'] == name]
        if mini_df['label'].values[0] == 1:
            n_with_mask += 1
            with_mask.append(*mini_df['path'].unique())
            total += len(mini_df)
        else:
            without_mask.append(*mini_df['path'].unique())

    return n_with_mask, with_mask, without_mask, total/n_with_mask

def average_height(df):
    return df['height'].mean()

def average_width(df):
    return df['width'].mean()

def average_area(df):
    return df['area'].mean()

from PIL import Image, ImageDraw
def add_bbox(path, size=(600, 600)):
    image = Image.open(path)
    draw = ImageDraw.Draw(image)
    for anno in df[df['path'] == path][['x_min','y_min', 'x_max', 'y_max']].values:
        # x/y were stored swapped in the DataFrame, so swap back when drawing
        draw.rectangle([(anno[1], anno[0]), (anno[3], anno[2])], outline="blue", width=3)
    image = image.resize(size)
    return image

def plot_images(paths, title):
    imgs = [add_bbox(path) for path in np.random.choice(paths, 4, replace=False)]
    plt.figure(figsize=(20, 5))
    plt.suptitle(title, fontsize=20)
    for i, img in enumerate(imgs):
        plt.subplot(1, 4, i+1)
        plt.imshow(img)
        plt.axis('off')
    plt.show()
In [ ]:
from IPython.display import Image
Image(url='https://media1.tenor.com/images/42983a95657f874f62cfc1f1152da484/tenor.gif?itemid=8718500')
Out[ ]:
In [ ]:
n_with_mask, with_mask, without_mask, avg_boxes = count_boxes(df)
total_length = len(df['image_id'].unique())
print(f'Total number of images {total_length}')
print(f'Total number of images with face mask {n_with_mask}')
print(f'Total number of images without face mask {total_length - n_with_mask}')
print(f'Average image height {average_height(df)}')
print(f'Average image width {average_width(df)}')
print(f'Average masks per image {avg_boxes}')

plot_images(with_mask, 'Examples with face mask')
plot_images(without_mask, 'Examples without face mask')
Total number of images 200
Total number of images with face mask 187
Total number of images without face mask 13
Average image height 265.9933774834437
Average image width 384.2251655629139
Average masks per image 3.160427807486631

After examining the images we saw that on average there are between 3 and 4 face masks in each image. We also noticed that in most of the pictures the number of people not wearing a face mask is larger, so rather than defining a separate "no mask" label we add those faces to the background. Under this assumption we can also discard the images that contain no face mask at all.

In [ ]:
#remove negative examples 
df = df[df['label']!=0]
display(df)
index image_id path height width x_min y_min x_max y_max w h area label
0 56 maksssksksss77.png /content/gdrive/My Drive/Training_afeka_object... 225 400 104 130 136 172 32 42 1344 1
1 56 maksssksksss77.png /content/gdrive/My Drive/Training_afeka_object... 225 400 105 54 134 88 29 34 986 1
2 56 maksssksksss77.png /content/gdrive/My Drive/Training_afeka_object... 225 400 98 0 124 11 26 11 286 1
3 57 maksssksksss49.png /content/gdrive/My Drive/Training_afeka_object... 225 400 107 351 122 367 15 16 240 1
4 57 maksssksksss49.png /content/gdrive/My Drive/Training_afeka_object... 225 400 37 272 49 288 12 16 192 1
... ... ... ... ... ... ... ... ... ... ... ... ... ...
586 46 maksssksksss11.png /content/gdrive/My Drive/Training_afeka_object... 267 400 83 214 98 234 15 20 300 1
587 47 maksssksksss107.png /content/gdrive/My Drive/Training_afeka_object... 400 301 229 125 311 210 82 85 6970 1
588 48 maksssksksss129.png /content/gdrive/My Drive/Training_afeka_object... 295 400 47 246 68 270 21 24 504 1
589 49 maksssksksss141.png /content/gdrive/My Drive/Training_afeka_object... 267 400 136 220 166 259 30 39 1170 1
590 49 maksssksksss141.png /content/gdrive/My Drive/Training_afeka_object... 267 400 110 131 153 175 43 44 1892 1

591 rows × 13 columns

Custom face mask dataset

In [4]:
def get_annotations(annotations, device='cuda'):
  ''' convert the COCO annotations of one image to the target dict
  format expected by torchvision detection models '''

  label = {
      'boxes': torch.tensor([to_rcnnformat(a['bbox']) for a in annotations]).to(device),
      'labels': torch.tensor([1 for a in annotations]).to(device),
      'image_id': torch.tensor(annotations[0]['image_id']).to(device),
      'area': torch.tensor([a['area'] for a in annotations]).to(device),
      'iscrowd': torch.tensor([0 for a in annotations]).to(device)
      }

  return label

def to_rcnnformat(bbox):
  ''' convert a COCO bounding box [x, y, w, h] to [x_min, y_min, x_max, y_max] '''
  return [bbox[0], bbox[1], bbox[0]+bbox[2], bbox[1]+bbox[3]]

class FaceMaskDataset(Dataset):
    """ Face Mask dataset """

    def __init__(self, data, root="", transform=None, cocomode=True):
        """
        Args:
            data (dict) json data
            root (string): location of images
            transform (callable, optional): Optional transform to be applied
                on a sample.
            cocomode (bool): whether use coco data formate
        """

        self.images = data['images']
        self.labels = {img['id']:[d for d in data['annotations'] if d['image_id'] == img['id']] for img in self.images}
        self.root = root
        self.imagetotensor = transforms.Compose([transforms.ToTensor()])
        self.transform = transform
        self.cocomode = cocomode
        self.device = torch.device('cuda') if torch.cuda.is_available() else torch.device('cpu')

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):

        im = Image.open(self.root + '/' + self.images[idx]['file_name']).convert('RGB')
        label_idx = self.images[idx]['id']
        label = self.labels[label_idx]

        if not self.cocomode:
          im = self.imagetotensor(im).to(self.device)
          label = get_annotations(label, self.device)

        if self.transform:
            im = self.transform(im)

        return im, label
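
As a quick sanity check of the conversion used above, `to_rcnnformat` turns a COCO `[x, y, width, height]` box into the `[x_min, y_min, x_max, y_max]` form that torchvision detection models expect (the function is restated here so the snippet is self-contained):

```python
def to_rcnnformat(bbox):
    # COCO [x, y, w, h] -> torchvision [x_min, y_min, x_max, y_max]
    return [bbox[0], bbox[1], bbox[0] + bbox[2], bbox[1] + bbox[3]]

coco_box = [104, 130, 32, 42]   # x, y, width, height
xyxy = to_rcnnformat(coco_box)
print(xyxy)                     # [104, 130, 136, 172]
```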

Helper functions for initializing the model

In [5]:
def load_resnet(backbone_model, num_classes=2):  # backbone_model in ['resnet50', 'resnet101', 'resnet152']

    # create an anchor_generator for the FPN,
    # which by default has 5 outputs
    anchor_generator = AnchorGenerator(
        sizes=tuple([(32, 64, 128, 256) for _ in range(5)]),
        aspect_ratios=tuple([(0.25, 0.5, 1.0, 2.0) for _ in range(5)]))

    pretrained_backbone = True
    backbone = resnet_fpn_backbone(backbone_model, pretrained_backbone)
    fasterrcnn = FasterRCNN(backbone, num_classes,
                    rpn_anchor_generator=anchor_generator,
                    rpn_head=RPNHead(256, anchor_generator.num_anchors_per_location()[0]))
    in_features = fasterrcnn.roi_heads.box_predictor.cls_score.in_features
    # replace the pre-trained head with a new one
    fasterrcnn.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes = num_classes)
    return fasterrcnn

def get_model(architecture: str, num_classes=2):
    anchor_generator = AnchorGenerator(
                        sizes=tuple([(32, 64, 128, 256) for _ in range(5)]),
                        aspect_ratios=tuple([(0.25, 0.5, 1.0, 2.0) for _ in range(5)]))
    if architecture in ('resnet50', 'resnet101', 'resnet152'):
        return load_resnet(architecture, num_classes)

    if architecture == 'mobilenet':
        backbone = torchvision.models.mobilenet_v2(pretrained=True).features
        backbone.out_channels = 1280
    elif architecture == 'squeezenet':
        backbone = torchvision.models.squeezenet1_0(pretrained=True).features
        backbone.out_channels = 512
    else:
        raise ValueError(f'Unknown architecture: {architecture}')

    fasterrcnn = FasterRCNN(backbone, num_classes,
                        rpn_anchor_generator=anchor_generator,
                        rpn_head=RPNHead(backbone.out_channels, anchor_generator.num_anchors_per_location()[0]))
    in_features = fasterrcnn.roi_heads.box_predictor.cls_score.in_features
    # replace the pre-trained head with a new one
    fasterrcnn.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes=num_classes)
    return fasterrcnn
        

Dataset and Dataloaders

In [6]:
with open('/content/gdrive/My Drive/all_without_inannotated.json') as json_data: # Read annotations file
    data = json.load(json_data)

def collate_fn(batch):
    return tuple(zip(*batch))

masks_dataset = FaceMaskDataset(data, PATH, cocomode=False)

train_dataset, val_dataset, test_dataset = random_split(masks_dataset,
                                                        [150, 27, 10])
                                            
trian_data_loader = DataLoader(train_dataset, batch_size=4, 
                               shuffle=True, collate_fn=collate_fn)
test_data_loader = DataLoader(test_dataset, batch_size=1, 
                              shuffle=False, collate_fn=collate_fn)
val_data_loader = DataLoader(val_dataset, batch_size=1, 
                             shuffle=False, collate_fn=collate_fn)
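
Detection batches cannot be stacked into a single tensor, because images have different sizes and targets hold different numbers of boxes; `collate_fn` simply transposes the list of `(image, target)` pairs into a tuple of images and a tuple of targets. A toy illustration with placeholder values:

```python
def collate_fn(batch):
    # [(img0, tgt0), (img1, tgt1)] -> ((img0, img1), (tgt0, tgt1))
    return tuple(zip(*batch))

# Stand-ins for (image, target) pairs with different numbers of boxes.
batch = [("img0", {"boxes": [[0, 0, 5, 5]]}),
         ("img1", {"boxes": [[1, 1, 4, 4], [2, 2, 6, 6]]})]
images, targets = collate_fn(batch)
print(images)                    # ('img0', 'img1')
print(len(targets[1]["boxes"]))  # 2
```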
In [ ]:
!pip install tensorboard==2.3.0
from torch.utils.tensorboard import SummaryWriter
writer = SummaryWriter()

Helper functions for the training part

In [8]:
import math
import sys

def train(model, optimizer, data_loader, device, epoch, print_freq):
    '''Adapted from the torchvision references train_one_epoch function
    so that it returns the training metrics.'''

    model.train()
    metric_logger = utils.MetricLogger(delimiter="  ")
    metric_logger.add_meter('lr', utils.SmoothedValue(window_size=1, fmt='{value:.6f}'))
    header = 'Epoch: [{}]'.format(epoch)

    lr_scheduler = None
    if epoch == 0:
        warmup_factor = 1. / 1000
        warmup_iters = min(1000, len(data_loader) - 1)

        lr_scheduler = utils.warmup_lr_scheduler(optimizer, warmup_iters, warmup_factor)

    for images, targets in metric_logger.log_every(data_loader, print_freq, header):
        images = list(image.to(device) for image in images)
        targets = [{k: v.to(device) for k, v in t.items()} for t in targets]

        loss_dict = model(images, targets)

        losses = sum(loss for loss in loss_dict.values())

        # reduce losses over all GPUs for logging purposes
        loss_dict_reduced = utils.reduce_dict(loss_dict)
        losses_reduced = sum(loss for loss in loss_dict_reduced.values())

        loss_value = losses_reduced.item()

        if not math.isfinite(loss_value):
            print("Loss is {}, stopping training".format(loss_value))
            print(loss_dict_reduced)
            sys.exit(1)

        optimizer.zero_grad()
        losses.backward()
        optimizer.step()

        if lr_scheduler is not None:
            lr_scheduler.step()

        metric_logger.update(loss=losses_reduced, **loss_dict_reduced)
        metric_logger.update(lr=optimizer.param_groups[0]["lr"])
    return metric_logger

def evaluate_loss(data_loader):
    ''' Compute the average loss over a validation set (uses the global
    model and device; the model stays in train mode because detection
    models only return their loss dict in train mode) '''
    model.train()
    with torch.no_grad():
        losses_graph = []
        for images, targets in tqdm(data_loader):
            
            images = list(image.to(device) for image in images)
            targets = [{k: v.to(device) for k, v in t.items()} for t in targets]

            loss_dict = model(images, targets)            
            losses = sum(loss for loss in loss_dict.values())

            losses_graph.append(float(losses.detach().cpu()))
        loss = sum(losses_graph)/len(losses_graph)
        return loss

Precision & recall

  • Precision: the fraction of predicted boxes that are correct.
  • Recall: the fraction of ground-truth objects that are found.
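
In terms of true positives (TP), false positives (FP), and false negatives (FN), the two metrics are computed as below (a toy calculation, not the COCO evaluator):

```python
def precision_recall(tp, fp, fn):
    # precision = TP / (TP + FP); recall = TP / (TP + FN)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return precision, recall

# e.g. 8 correct detections, 2 spurious boxes, 4 missed masks
p, r = precision_recall(tp=8, fp=2, fn=4)
print(p, r)  # 0.8 0.6666666666666666
```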

COCO mAP

Recent research papers tend to report results on the COCO dataset. COCO mAP uses a 101-point interpolated AP definition. For COCO, AP is averaged over multiple IoU thresholds (the minimum IoU required to count a detection as a positive match). AP@[.5:.95] is the average AP for IoU from 0.5 to 0.95 with a step size of 0.05. For the COCO competition, AP is the average over these 10 IoU levels on 80 categories (AP@[.50:.05:.95]: from 0.5 to 0.95 with a step size of 0.05). Several other metrics are also collected for the COCO dataset.
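
The two ingredients can be sketched as follows: an IoU for axis-aligned `[x_min, y_min, x_max, y_max]` boxes, and the 10 IoU thresholds that AP@[.50:.05:.95] averages over (a simplified illustration, not the 101-point COCO evaluator):

```python
def iou(a, b):
    # a, b: [x_min, y_min, x_max, y_max]
    ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))  # overlap width
    iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))  # overlap height
    inter = ix * iy
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

print(iou([0, 0, 10, 10], [0, 0, 10, 10]))  # 1.0
print(iou([0, 0, 10, 10], [5, 0, 15, 10]))  # 0.3333333333333333

# The 10 IoU levels averaged by AP@[.50:.05:.95]
thresholds = [round(0.5 + 0.05 * i, 2) for i in range(10)]
print(thresholds[0], thresholds[-1])  # 0.5 0.95
```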


In [9]:
from tqdm import tqdm_notebook as tqdm
from engine import *
def run():
    '''Conduct a complete training run for one model.'''
    
    model_metrics = {'train':
                        {'loss': [],
                        'average precision': [],
                        'average recall': []},
                    'validition':
                        {'loss': [],
                        'average precision': [],
                        'average recall': []}}

    for epoch in tqdm(range(num_epochs)):
        # train for one epoch, printing every 10 iterations
        stats = train(model, optimizer, trian_data_loader, device, epoch, print_freq=10)
        model_metrics['train']['loss'].append(stats.meters['loss'].value)
        eval_metrics = evaluate(model, trian_data_loader, device=device)
        # add to graphs plotting section
        model_metrics['train']['average precision'].append(
                                                            eval_metrics.coco_eval['bbox'].stats[0])
        model_metrics['train']['average recall'].append(
                                                            eval_metrics.coco_eval['bbox'].stats[-4])
        # add to tensorboard
        writer.add_scalar("Loss/ Train", stats.meters['loss'].value, epoch)
        writer.add_scalar("Loss/loss_classifier/ Train", stats.meters['loss_classifier'].value, epoch)
        writer.add_scalar("Loss/loss_box_reg/ Train", stats.meters['loss_box_reg'].value, epoch)
        writer.add_scalar("Loss/loss_objectness/ Train", stats.meters['loss_objectness'].value, epoch)
        writer.add_scalar("Loss/loss_rpn_box_reg/ Train", stats.meters['loss_rpn_box_reg'].value, epoch)
        
        # evaluate on the validation dataset
        model_metrics['validition']['loss'].append(evaluate_loss(val_data_loader))
        eval_metrics = evaluate(model, val_data_loader, device=device)
        apt = eval_metrics.coco_eval['bbox'].stats[0]
        art = eval_metrics.coco_eval['bbox'].stats[-4]

        # add to graphs plotting section
        model_metrics['validition']['average precision'].append(apt)
        model_metrics['validition']['average recall'].append(art)

        # add to tensorboard
        writer.add_scalar("Average Precision/ Validation", apt, epoch)
        writer.add_scalar("Average Recall/ Validation", art, epoch)

        # update the learning rate
        lr_scheduler.step()

    return model_metrics

Dirty implementation

To decide which model is better, we selected 5 candidate models, each initialized with weights pre-trained on ImageNet.

Each model was trained for 10 epochs. For the optimizer and lr-scheduler hyperparameters we used the ones from the TorchVision Object Detection Finetuning Tutorial. The models were compared based on their average precision; the one that achieves the best precision will be tuned further.

Models:

  • ResNet-50: 25,557,032 parameters
  • ResNet-101: 44,549,160 parameters
  • ResNet-152: 60,192,808 parameters
  • MobileNetV2: 3,504,872 parameters
  • SqueezeNet: 1,248,424 parameters
optimizer = torch.optim.SGD(model.parameters(),
                            lr=0.005, momentum=0.9, weight_decay=0.0005)
lr_scheduler = torch.optim.lr_scheduler.StepLR(optimizer,
                                               step_size=3,
                                               gamma=0.1)
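
With `step_size=3` and `gamma=0.1`, StepLR multiplies the learning rate by 0.1 every 3 epochs; the rule is just `lr = base_lr * gamma ** (epoch // step_size)` (a plain-Python sketch of the schedule, not the torch scheduler itself):

```python
def steplr(base_lr, epoch, step_size=3, gamma=0.1):
    # lr is decayed by a factor of gamma every step_size epochs
    return base_lr * gamma ** (epoch // step_size)

# The lr used in each of the 10 epochs, starting from 0.005.
for epoch in range(10):
    print(epoch, steplr(0.005, epoch))
```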
In [ ]:
compare_result = []
CHECKPOINT_DIR_PATH = '/content/gdrive/My Drive/Colab Notebooks/new_result.pt'
# ['resnet50', 'resnet101', 'resnet152', 'mobilenet', 'squeezenet']
for name in ['resnet50', 'resnet101', 'resnet152', 'mobilenet', 'squeezenet']:
    #Set model
    model = get_model(name)
    model.to(device)

    # let's train it for 10 epochs
    num_epochs = 10

    #Set Optimizer
    optimizer = torch.optim.SGD(model.parameters(), lr=0.005,
                                momentum=0.9, weight_decay=0.0005)
    # and a learning rate scheduler
    lr_scheduler = torch.optim.lr_scheduler.StepLR(optimizer,
                                                    step_size=3,
                                                    gamma=0.1)
    compare_result.append((name, run(), model.state_dict()))

    
    torch.save(compare_result, CHECKPOINT_DIR_PATH)
    del model
In [11]:
res1  = torch.load('/content/gdrive/My Drive/Colab Notebooks/result resnet50 resnet101.pt')
res2  = torch.load('/content/gdrive/My Drive/Colab Notebooks/result2.pt')
In [ ]:
plt.figure(figsize=[20, 10])
plt.subplot(1, 3, 1)
plt.title('Average Precision', fontsize=20)
for result in res1:
    label, details, model= result[:3]
    plt.plot(details['validition']['average precision'], label=label)
for result in res2:
    label, details, model= result[:3]
    plt.plot(details['validition']['average precision'], label=label)
plt.legend()

plt.subplot(1, 3, 2)
plt.title('Average Recall', fontsize=20)
for result in res1:
    label, details, model= result[:3]
    plt.plot(details['validition']['average recall'], label=label)
for result in res2:
    label, details, model= result[:3]
    plt.plot(details['validition']['average recall'], label=label)
plt.legend()

plt.subplot(1, 3, 3)
plt.title('Loss', fontsize=20)
for result in res1:
    label, details, model= result[:3]
    plt.plot(details['validition']['loss'], label=label)
for result in res2:
    label, details, model= result[:3]
    plt.plot(details['validition']['loss'], label=label)
plt.legend()
plt.show()   

From the graphs above we can see that resnet101 got the highest precision score, so this is the architecture we select.

This network will undergo further adjustments and will be trained again, but first let's look at the predictions and see how well the net does!

In [50]:
from PIL import Image, ImageDraw

def show_prediction(image, predictions, ground_truth=None, threshold=25):
    trans = transforms.ToPILImage()
    image = trans(image.detach().cpu())
    draw = ImageDraw.Draw(image)

    if ground_truth:
        for anno in ground_truth:
            draw.rectangle([(anno[0], anno[1]), (anno[2], anno[3])],
                           outline="#6e090c", width=2)

    for anno, score in zip(predictions[0]['boxes'], predictions[0]['scores']):
        anno = anno.detach().cpu().numpy().astype(int)
        score = float(score.detach().cpu())

        rscore = np.round(score * 100, 2)
        if rscore > threshold:
            # rectangle expects (x_min, y_min) before (x_max, y_max)
            draw.rectangle([(anno[0], anno[1]), (anno[2], anno[3])],
                           outline="#bdffff", width=2)
            draw.rectangle([(anno[0], anno[1] - 9),
                            (anno[0] + 43, (anno[1] - 2))],
                           fill="#bdffff")
            draw.text((anno[0], anno[1] - 10), f"{rscore} %", (0, 0, 0))

    display(image)

def test_predictions(model, dataloader, threshold=25):
    model.eval()
    with torch.no_grad():
        for sample_img, sample_ann in dataloader:
            imgs = list(img.to(device) for img in sample_img)
            annotations = []
            for dic in sample_ann:
                for bbox in dic['boxes'].detach().cpu().numpy():
                    annotations.append(bbox)
            result_dict = model(imgs)
            show_prediction(imgs[0], result_dict, annotations, threshold)


def load_best_model(name):
    for m in res1:
        if m[0] == name:
            model = get_model(m[0])
            model.load_state_dict(m[2])
            model.to(device)
            return model, m[1]
dirty_model, dirty_result = load_best_model('resnet101')
test_predictions(dirty_model, test_data_loader, 50)

AdaMod

AdaMod is a stochastic optimizer that restricts adaptive learning rates with adaptive and momental upper bounds. The dynamic learning rate bounds are based on the exponential moving averages of the adaptive learning rates themselves, which smooth out unexpected large learning rates and stabilize the training of deep neural networks.

AdaMod is a drop-in replacement for Adam. The only change is a new hyperparameter, beta3 (B3), which controls the degree of lookback for the long-term clipping average.
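
The clipping idea can be sketched as follows: keep an exponential moving average (controlled by beta3) of the per-step adaptive learning rate, and never step with more than that average allows (a schematic of the bounding rule only, not the full AdaMod update with its Adam machinery):

```python
def momental_bound(adaptive_lrs, beta3=0.999):
    # s_t = beta3 * s_{t-1} + (1 - beta3) * eta_t; step uses min(eta_t, s_t)
    s = 0.0
    clipped = []
    for eta in adaptive_lrs:
        s = beta3 * s + (1 - beta3) * eta
        clipped.append(min(eta, s))
    return clipped

# A sudden spike in the adaptive rate gets capped by the moving average.
rates = [0.001, 0.001, 0.5, 0.001]
print(momental_bound(rates, beta3=0.9))
```

Note that with `s` starting at 0, early steps are also bounded tightly, which acts like a warmup.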

Cosine learning rate decay:

Several schedulers were tested to obtain some learning rate decay during training:

StepLR - requires a lot of time to tune the scheduler hyperparameters well.

Cosine learning rate decay - the built-in implementation from the PyTorch schedulers reported a learning rate different from the one the optimizer actually used (and, as mentioned in the GitHub issue referenced in the code, could not be used as-is). We therefore use a manual cosine-decay implementation, which solves the problem and achieves a moderate reduction of the learning rate throughout training, with the reported learning rate values matching the optimizer's.
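
The cosine rule itself is simple. For the values used below (`eta_max=1e-4`, `T_max=10`), it decays the rate from the base value down to `eta_min` over the run (a plain-math sketch of the same formula the scheduler class implements):

```python
import math

def cosine_lr(t, T_max, eta_max, eta_min=0.0):
    # eta_t = eta_min + 0.5 * (eta_max - eta_min) * (1 + cos(pi * t / T_max))
    return eta_min + 0.5 * (eta_max - eta_min) * (1 + math.cos(math.pi * t / T_max))

print(cosine_lr(0, 10, 1e-4))   # starts at the base lr (1e-4)
print(cosine_lr(10, 10, 1e-4))  # decays to eta_min (0 here)
```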

References

In [10]:
!pip install adamod
import adamod as adamod
Collecting adamod
  Downloading https://files.pythonhosted.org/packages/08/84/36499b6bd4b0bd06670210b636216780699488f6c7666f766dc6f164db7d/adamod-0.0.3-py3-none-any.whl
Requirement already satisfied: torch>=0.4.0 in /usr/local/lib/python3.6/dist-packages (from adamod) (1.7.0+cu101)
Requirement already satisfied: numpy in /usr/local/lib/python3.6/dist-packages (from torch>=0.4.0->adamod) (1.18.5)
Requirement already satisfied: typing-extensions in /usr/local/lib/python3.6/dist-packages (from torch>=0.4.0->adamod) (3.7.4.3)
Requirement already satisfied: dataclasses in /usr/local/lib/python3.6/dist-packages (from torch>=0.4.0->adamod) (0.7)
Requirement already satisfied: future in /usr/local/lib/python3.6/dist-packages (from torch>=0.4.0->adamod) (0.16.0)
Installing collected packages: adamod
Successfully installed adamod-0.0.3
In [11]:
from torch.optim.lr_scheduler import _LRScheduler
import math
# https://github.com/pytorch/pytorch/issues/17913
class LegacyCosineAnnealingLR(_LRScheduler):
    r"""Set the learning rate of each parameter group using a cosine annealing
    schedule, where :math:`\eta_{max}` is set to the initial lr and
    :math:`T_{cur}` is the number of epochs since the last restart in SGDR:

    .. math::

        \eta_t = \eta_{min} + \frac{1}{2}(\eta_{max} - \eta_{min})(1 +
        \cos(\frac{T_{cur}}{T_{max}}\pi))

    When last_epoch=-1, sets initial lr as lr.

    It has been proposed in
    `SGDR: Stochastic Gradient Descent with Warm Restarts`_. Note that this only
    implements the cosine annealing part of SGDR, and not the restarts.

    Args:
        optimizer (Optimizer): Wrapped optimizer.
        T_max (int): Maximum number of iterations.
        eta_min (float): Minimum learning rate. Default: 0.
        last_epoch (int): The index of last epoch. Default: -1.

    .. _SGDR\: Stochastic Gradient Descent with Warm Restarts:
        https://arxiv.org/abs/1608.03983
    """

    def __init__(self, optimizer, T_max, eta_min=0, last_epoch=-1):
        self.T_max = T_max
        self.eta_min = eta_min
        super(LegacyCosineAnnealingLR, self).__init__(optimizer, last_epoch)

    def get_lr(self):
        return [self.eta_min + (base_lr - self.eta_min) *
                (1 + math.cos(math.pi * self.last_epoch / self.T_max)) / 2
                for base_lr in self.base_lrs]
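The closed-form schedule in `get_lr` can be sanity-checked without an optimizer. A minimal sketch (the `cosine_lr` helper is hypothetical, mirroring the formula in the docstring above):

```python
import math

# Hypothetical helper mirroring LegacyCosineAnnealingLR.get_lr:
# eta_t = eta_min + (eta_max - eta_min) * (1 + cos(pi * T_cur / T_max)) / 2
def cosine_lr(base_lr, t, T_max, eta_min=0.0):
    return eta_min + (base_lr - eta_min) * (1 + math.cos(math.pi * t / T_max)) / 2

# With the values used later in this notebook (lr=1e-4, T_max=10, eta_min~1e-7),
# the schedule starts at the base lr and anneals monotonically down to eta_min:
schedule = [cosine_lr(1e-4, t, T_max=10, eta_min=1e-7) for t in range(11)]
```

At `t = 0` the cosine term is 1, so the schedule returns the base lr; at `t = T_max` it is -1, so the schedule bottoms out at `eta_min`.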

In [35]:
CHECKPOINT_DIR_PATH = '/content/gdrive/My Drive/Colab Notebooks/resnet101v2.pt'
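For reference, the checkpoint written later in this notebook stores a two-element list, `[compare_result, model.state_dict()]`. A minimal round-trip sketch of that format (the dummy result dict and `Linear` module here are placeholders, not the real model):

```python
import os
import tempfile
import torch

# Same layout as the later call:
# torch.save([compare_result, model.state_dict()], CHECKPOINT_DIR_PATH)
path = os.path.join(tempfile.mkdtemp(), 'resnet101v2.pt')
dummy_result = {'AP': 0.54}                      # placeholder for compare_result
dummy_state = torch.nn.Linear(2, 2).state_dict() # placeholder for the real weights
torch.save([dummy_result, dummy_state], path)

# Restoring: unpack the list, then load the weights into a freshly built model
compare_result, state_dict = torch.load(path)
```

After loading, `state_dict` would be passed to `model.load_state_dict(...)` on a model built with the same architecture.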
In [ ]:
# Set up the model
model = get_model('resnet101')
# move model to the right device
model.to(device)

# train for 10 epochs
num_epochs = 10

# Set up the optimizer
optimizer = adamod.AdaMod(model.parameters(), lr=0.0001, weight_decay=0.0001, beta3=0.999)

# Learning rate scheduler: cosine annealing from the base lr down to eta_min
lr_scheduler = LegacyCosineAnnealingLR(optimizer, T_max=num_epochs, eta_min=1e-7)

compare_result = run()
torch.save([compare_result, model.state_dict()], CHECKPOINT_DIR_PATH)
Epoch: [0]  [ 0/38]  eta: 0:02:35  lr: 0.000003  loss: 1.5206 (1.5206)  loss_classifier: 0.7124 (0.7124)  loss_box_reg: 0.0002 (0.0002)  loss_objectness: 0.6938 (0.6938)  loss_rpn_box_reg: 0.1142 (0.1142)  time: 4.1004  data: 0.8296  max mem: 11267
Epoch: [0]  [10/38]  eta: 0:01:41  lr: 0.000030  loss: 1.5366 (1.5486)  loss_classifier: 0.7114 (0.7113)  loss_box_reg: 0.0072 (0.0126)  loss_objectness: 0.6940 (0.6943)  loss_rpn_box_reg: 0.1142 (0.1304)  time: 3.6114  data: 1.2905  max mem: 11360
Epoch: [0]  [20/38]  eta: 0:01:02  lr: 0.000057  loss: 1.5366 (1.5484)  loss_classifier: 0.7070 (0.7055)  loss_box_reg: 0.0073 (0.0127)  loss_objectness: 0.6944 (0.6946)  loss_rpn_box_reg: 0.1173 (0.1356)  time: 3.4519  data: 1.2091  max mem: 11809
Epoch: [0]  [30/38]  eta: 0:00:27  lr: 0.000084  loss: 1.4832 (1.5231)  loss_classifier: 0.6846 (0.6919)  loss_box_reg: 0.0097 (0.0129)  loss_objectness: 0.6944 (0.6945)  loss_rpn_box_reg: 0.1026 (0.1238)  time: 3.4137  data: 1.1221  max mem: 11809
Epoch: [0]  [37/38]  eta: 0:00:03  lr: 0.000100  loss: 1.4682 (1.5111)  loss_classifier: 0.6474 (0.6706)  loss_box_reg: 0.0089 (0.0135)  loss_objectness: 0.6937 (0.6943)  loss_rpn_box_reg: 0.1093 (0.1327)  time: 3.5318  data: 1.1598  max mem: 11809
Epoch: [0] Total time: 0:02:12 (3.4901 s / it)
creating index...
index created!
Test:  [ 0/38]  eta: 0:00:46  model_time: 1.1113 (1.1113)  evaluator_time: 0.0254 (0.0254)  time: 1.2244  data: 0.0875  max mem: 11809
Test:  [37/38]  eta: 0:00:01  model_time: 1.0249 (1.0051)  evaluator_time: 0.0419 (0.0407)  time: 1.0992  data: 0.0700  max mem: 11809
Test: Total time: 0:00:42 (1.1162 s / it)
Averaged stats: model_time: 1.0249 (1.0051)  evaluator_time: 0.0419 (0.0407)
Accumulating evaluation results...
DONE (t=0.12s).
IoU metric: bbox
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.001
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.004
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.008
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.001
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.018
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.000
creating index...
index created!
Test:  [ 0/27]  eta: 0:00:07  model_time: 0.2555 (0.2555)  evaluator_time: 0.0058 (0.0058)  time: 0.2720  data: 0.0105  max mem: 11809
Test:  [26/27]  eta: 0:00:00  model_time: 0.1815 (0.1831)  evaluator_time: 0.0091 (0.0104)  time: 0.2080  data: 0.0188  max mem: 11809
Test: Total time: 0:00:05 (0.2119 s / it)
Averaged stats: model_time: 0.1815 (0.1831)  evaluator_time: 0.0091 (0.0104)
Accumulating evaluation results...
DONE (t=0.02s).
IoU metric: bbox
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.005
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.014
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.044
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.000
Epoch: [1]  [ 0/38]  eta: 0:01:41  lr: 0.000098  loss: 1.3474 (1.3474)  loss_classifier: 0.5021 (0.5021)  loss_box_reg: 0.0707 (0.0707)  loss_objectness: 0.6934 (0.6934)  loss_rpn_box_reg: 0.0812 (0.0812)  time: 2.6601  data: 0.0812  max mem: 11809
Epoch: [1]  [10/38]  eta: 0:01:10  lr: 0.000098  loss: 1.0253 (1.0803)  loss_classifier: 0.2032 (0.2455)  loss_box_reg: 0.0100 (0.0184)  loss_objectness: 0.6929 (0.6929)  loss_rpn_box_reg: 0.1304 (0.1234)  time: 2.5009  data: 0.0725  max mem: 11809
Epoch: [1]  [20/38]  eta: 0:00:45  lr: 0.000098  loss: 0.9553 (0.9896)  loss_classifier: 0.0952 (0.1620)  loss_box_reg: 0.0123 (0.0185)  loss_objectness: 0.6891 (0.6857)  loss_rpn_box_reg: 0.1197 (0.1235)  time: 2.5125  data: 0.0722  max mem: 11809
Epoch: [1]  [30/38]  eta: 0:00:20  lr: 0.000098  loss: 0.8229 (0.9437)  loss_classifier: 0.0485 (0.1236)  loss_box_reg: 0.0139 (0.0186)  loss_objectness: 0.6684 (0.6753)  loss_rpn_box_reg: 0.1188 (0.1262)  time: 2.6466  data: 0.0730  max mem: 11809
Epoch: [1]  [37/38]  eta: 0:00:02  lr: 0.000098  loss: 0.7986 (0.9018)  loss_classifier: 0.0311 (0.1061)  loss_box_reg: 0.0059 (0.0160)  loss_objectness: 0.6414 (0.6517)  loss_rpn_box_reg: 0.1224 (0.1280)  time: 2.5663  data: 0.0707  max mem: 11809
Epoch: [1] Total time: 0:01:36 (2.5461 s / it)
creating index...
index created!
Test:  [ 0/38]  eta: 0:00:45  model_time: 1.1160 (1.1160)  evaluator_time: 0.0022 (0.0022)  time: 1.2054  data: 0.0870  max mem: 11809
Test:  [37/38]  eta: 0:00:01  model_time: 1.1061 (1.0084)  evaluator_time: 0.0012 (0.0016)  time: 1.0985  data: 0.0707  max mem: 11809
Test: Total time: 0:00:41 (1.0839 s / it)
Averaged stats: model_time: 1.1061 (1.0084)  evaluator_time: 0.0012 (0.0016)
Accumulating evaluation results...
DONE (t=0.01s).
IoU metric: bbox
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.001
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.001
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.001
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.003
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.000
creating index...
index created!
Test:  [ 0/27]  eta: 0:00:07  model_time: 0.2573 (0.2573)  evaluator_time: 0.0006 (0.0006)  time: 0.2677  data: 0.0098  max mem: 11809
Test:  [26/27]  eta: 0:00:00  model_time: 0.1844 (0.1833)  evaluator_time: 0.0006 (0.0007)  time: 0.1985  data: 0.0192  max mem: 11809
Test: Total time: 0:00:05 (0.2029 s / it)
Averaged stats: model_time: 0.1844 (0.1833)  evaluator_time: 0.0006 (0.0007)
Accumulating evaluation results...
DONE (t=0.00s).
IoU metric: bbox
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.000
Epoch: [2]  [ 0/38]  eta: 0:01:10  lr: 0.000090  loss: 0.6821 (0.6821)  loss_classifier: 0.0344 (0.0344)  loss_box_reg: 0.0008 (0.0008)  loss_objectness: 0.4380 (0.4380)  loss_rpn_box_reg: 0.2089 (0.2089)  time: 1.8571  data: 0.0760  max mem: 11809
Epoch: [2]  [10/38]  eta: 0:01:08  lr: 0.000090  loss: 0.6955 (0.7316)  loss_classifier: 0.0394 (0.0993)  loss_box_reg: 0.0206 (0.0504)  loss_objectness: 0.4159 (0.4273)  loss_rpn_box_reg: 0.1639 (0.1547)  time: 2.4520  data: 0.0738  max mem: 11809
Epoch: [2]  [20/38]  eta: 0:00:45  lr: 0.000090  loss: 0.8108 (0.7764)  loss_classifier: 0.1422 (0.1339)  loss_box_reg: 0.1146 (0.1296)  loss_objectness: 0.3916 (0.3780)  loss_rpn_box_reg: 0.1188 (0.1349)  time: 2.5322  data: 0.0729  max mem: 11809
Epoch: [2]  [30/38]  eta: 0:00:19  lr: 0.000090  loss: 0.9498 (0.8679)  loss_classifier: 0.2081 (0.1827)  loss_box_reg: 0.2926 (0.2047)  loss_objectness: 0.3164 (0.3553)  loss_rpn_box_reg: 0.1015 (0.1251)  time: 2.4700  data: 0.0724  max mem: 11809
Epoch: [2]  [37/38]  eta: 0:00:02  lr: 0.000090  loss: 1.0050 (0.9104)  loss_classifier: 0.2554 (0.2000)  loss_box_reg: 0.3226 (0.2433)  loss_objectness: 0.2810 (0.3387)  loss_rpn_box_reg: 0.1148 (0.1284)  time: 2.2996  data: 0.0673  max mem: 11809
Epoch: [2] Total time: 0:01:31 (2.4109 s / it)
creating index...
index created!
Test:  [ 0/38]  eta: 0:00:36  model_time: 0.8384 (0.8384)  evaluator_time: 0.0484 (0.0484)  time: 0.9619  data: 0.0749  max mem: 11809
Test:  [37/38]  eta: 0:00:01  model_time: 1.1042 (0.9865)  evaluator_time: 0.0308 (0.0327)  time: 1.1110  data: 0.0690  max mem: 11809
Test: Total time: 0:00:41 (1.0877 s / it)
Averaged stats: model_time: 1.1042 (0.9865)  evaluator_time: 0.0308 (0.0327)
Accumulating evaluation results...
DONE (t=0.10s).
IoU metric: bbox
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.046
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.159
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.009
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.040
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.075
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.171
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.036
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.158
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.297
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.217
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.407
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.353
creating index...
index created!
Test:  [ 0/27]  eta: 0:00:06  model_time: 0.2260 (0.2260)  evaluator_time: 0.0078 (0.0078)  time: 0.2425  data: 0.0086  max mem: 11809
Test:  [26/27]  eta: 0:00:00  model_time: 0.1807 (0.1806)  evaluator_time: 0.0079 (0.0091)  time: 0.2049  data: 0.0187  max mem: 11809
Test: Total time: 0:00:05 (0.2078 s / it)
Averaged stats: model_time: 0.1807 (0.1806)  evaluator_time: 0.0079 (0.0091)
Accumulating evaluation results...
DONE (t=0.02s).
IoU metric: bbox
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.024
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.104
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.006
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.054
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.027
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.260
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.016
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.100
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.257
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.200
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.312
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.450
Epoch: [3]  [ 0/38]  eta: 0:01:57  lr: 0.000079  loss: 0.7420 (0.7420)  loss_classifier: 0.2441 (0.2441)  loss_box_reg: 0.2700 (0.2700)  loss_objectness: 0.1505 (0.1505)  loss_rpn_box_reg: 0.0773 (0.0773)  time: 3.0965  data: 0.0762  max mem: 11809
Epoch: [3]  [10/38]  eta: 0:01:09  lr: 0.000079  loss: 0.9614 (1.0051)  loss_classifier: 0.2594 (0.2689)  loss_box_reg: 0.4154 (0.3986)  loss_objectness: 0.1953 (0.2090)  loss_rpn_box_reg: 0.1335 (0.1285)  time: 2.4899  data: 0.0721  max mem: 11809
Epoch: [3]  [20/38]  eta: 0:00:44  lr: 0.000079  loss: 0.9614 (0.9811)  loss_classifier: 0.2592 (0.2594)  loss_box_reg: 0.3674 (0.3742)  loss_objectness: 0.1872 (0.2056)  loss_rpn_box_reg: 0.1354 (0.1419)  time: 2.4386  data: 0.0705  max mem: 11809
Epoch: [3]  [30/38]  eta: 0:00:20  lr: 0.000079  loss: 0.9029 (0.9494)  loss_classifier: 0.2130 (0.2364)  loss_box_reg: 0.2942 (0.3559)  loss_objectness: 0.1760 (0.2048)  loss_rpn_box_reg: 0.1595 (0.1523)  time: 2.5057  data: 0.0664  max mem: 11809
Epoch: [3]  [37/38]  eta: 0:00:02  lr: 0.000079  loss: 0.8414 (0.9376)  loss_classifier: 0.2027 (0.2314)  loss_box_reg: 0.2997 (0.3567)  loss_objectness: 0.1777 (0.2028)  loss_rpn_box_reg: 0.1502 (0.1468)  time: 2.5120  data: 0.0656  max mem: 11809
Epoch: [3] Total time: 0:01:34 (2.4779 s / it)
creating index...
index created!
Test:  [ 0/38]  eta: 0:00:45  model_time: 1.1015 (1.1015)  evaluator_time: 0.0340 (0.0340)  time: 1.2000  data: 0.0644  max mem: 11809
Test:  [37/38]  eta: 0:00:01  model_time: 0.9982 (0.9779)  evaluator_time: 0.0251 (0.0279)  time: 1.0701  data: 0.0700  max mem: 11809
Test: Total time: 0:00:40 (1.0756 s / it)
Averaged stats: model_time: 0.9982 (0.9779)  evaluator_time: 0.0251 (0.0279)
Accumulating evaluation results...
DONE (t=0.09s).
IoU metric: bbox
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.186
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.489
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.094
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.067
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.390
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.366
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.144
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.325
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.394
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.301
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.514
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.511
creating index...
index created!
Test:  [ 0/27]  eta: 0:00:06  model_time: 0.2424 (0.2424)  evaluator_time: 0.0021 (0.0021)  time: 0.2540  data: 0.0094  max mem: 11809
Test:  [26/27]  eta: 0:00:00  model_time: 0.1794 (0.1817)  evaluator_time: 0.0037 (0.0071)  time: 0.2038  data: 0.0187  max mem: 11809
Test: Total time: 0:00:05 (0.2070 s / it)
Averaged stats: model_time: 0.1794 (0.1817)  evaluator_time: 0.0037 (0.0071)
Accumulating evaluation results...
DONE (t=0.02s).
IoU metric: bbox
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.119
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.389
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.035
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.061
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.293
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.255
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.092
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.266
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.336
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.291
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.424
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.350
Epoch: [4]  [ 0/38]  eta: 0:01:31  lr: 0.000065  loss: 0.9422 (0.9422)  loss_classifier: 0.2176 (0.2176)  loss_box_reg: 0.4247 (0.4247)  loss_objectness: 0.1274 (0.1274)  loss_rpn_box_reg: 0.1724 (0.1724)  time: 2.4109  data: 0.0765  max mem: 11809
Epoch: [4]  [10/38]  eta: 0:01:09  lr: 0.000065  loss: 0.7466 (0.8535)  loss_classifier: 0.2104 (0.2066)  loss_box_reg: 0.4051 (0.3726)  loss_objectness: 0.1383 (0.1470)  loss_rpn_box_reg: 0.1309 (0.1272)  time: 2.4828  data: 0.0750  max mem: 11809
Epoch: [4]  [20/38]  eta: 0:00:44  lr: 0.000065  loss: 0.7695 (0.8571)  loss_classifier: 0.2038 (0.2015)  loss_box_reg: 0.4186 (0.4167)  loss_objectness: 0.1079 (0.1223)  loss_rpn_box_reg: 0.1038 (0.1165)  time: 2.5032  data: 0.0765  max mem: 11809
Epoch: [4]  [30/38]  eta: 0:00:20  lr: 0.000065  loss: 0.7188 (0.7935)  loss_classifier: 0.1462 (0.1823)  loss_box_reg: 0.3790 (0.3854)  loss_objectness: 0.0886 (0.1096)  loss_rpn_box_reg: 0.1063 (0.1160)  time: 2.5388  data: 0.0723  max mem: 11809
Epoch: [4]  [37/38]  eta: 0:00:02  lr: 0.000065  loss: 0.6458 (0.7620)  loss_classifier: 0.1408 (0.1738)  loss_box_reg: 0.2956 (0.3659)  loss_objectness: 0.0689 (0.1063)  loss_rpn_box_reg: 0.1158 (0.1160)  time: 2.4125  data: 0.0685  max mem: 11809
Epoch: [4] Total time: 0:01:33 (2.4595 s / it)
creating index...
index created!
Test:  [ 0/38]  eta: 0:00:46  model_time: 1.1594 (1.1594)  evaluator_time: 0.0052 (0.0052)  time: 1.2281  data: 0.0634  max mem: 11809
Test:  [37/38]  eta: 0:00:01  model_time: 0.8099 (0.9748)  evaluator_time: 0.0104 (0.0128)  time: 1.0159  data: 0.0712  max mem: 11809
Test: Total time: 0:00:40 (1.0597 s / it)
Averaged stats: model_time: 0.8099 (0.9748)  evaluator_time: 0.0104 (0.0128)
Accumulating evaluation results...
DONE (t=0.05s).
IoU metric: bbox
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.404
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.847
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.352
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.245
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.572
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.498
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.203
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.458
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.487
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.381
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.638
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.511
creating index...
index created!
Test:  [ 0/27]  eta: 0:00:07  model_time: 0.2605 (0.2605)  evaluator_time: 0.0016 (0.0016)  time: 0.2722  data: 0.0100  max mem: 11809
Test:  [26/27]  eta: 0:00:00  model_time: 0.1815 (0.1828)  evaluator_time: 0.0026 (0.0044)  time: 0.2017  data: 0.0194  max mem: 11809
Test: Total time: 0:00:05 (0.2064 s / it)
Averaged stats: model_time: 0.1815 (0.1828)  evaluator_time: 0.0026 (0.0044)
Accumulating evaluation results...
DONE (t=0.01s).
IoU metric: bbox
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.280
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.709
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.164
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.167
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.430
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.399
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.161
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.329
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.374
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.309
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.484
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.400
Epoch: [5]  [ 0/38]  eta: 0:01:58  lr: 0.000050  loss: 0.3979 (0.3979)  loss_classifier: 0.1435 (0.1435)  loss_box_reg: 0.1864 (0.1864)  loss_objectness: 0.0301 (0.0301)  loss_rpn_box_reg: 0.0380 (0.0380)  time: 3.1150  data: 0.0960  max mem: 11809
Epoch: [5]  [10/38]  eta: 0:01:13  lr: 0.000050  loss: 0.6914 (0.6669)  loss_classifier: 0.1452 (0.1621)  loss_box_reg: 0.2838 (0.3196)  loss_objectness: 0.0566 (0.0703)  loss_rpn_box_reg: 0.1243 (0.1149)  time: 2.6136  data: 0.0762  max mem: 11809
Epoch: [5]  [20/38]  eta: 0:00:45  lr: 0.000050  loss: 0.6965 (0.6750)  loss_classifier: 0.1479 (0.1583)  loss_box_reg: 0.3074 (0.3306)  loss_objectness: 0.0611 (0.0696)  loss_rpn_box_reg: 0.1185 (0.1165)  time: 2.4931  data: 0.0731  max mem: 11809
Epoch: [5]  [30/38]  eta: 0:00:20  lr: 0.000050  loss: 0.7086 (0.6942)  loss_classifier: 0.1597 (0.1636)  loss_box_reg: 0.3635 (0.3519)  loss_objectness: 0.0611 (0.0672)  loss_rpn_box_reg: 0.1033 (0.1115)  time: 2.4814  data: 0.0737  max mem: 11809
Epoch: [5]  [37/38]  eta: 0:00:02  lr: 0.000050  loss: 0.6600 (0.6864)  loss_classifier: 0.1426 (0.1579)  loss_box_reg: 0.3873 (0.3510)  loss_objectness: 0.0518 (0.0680)  loss_rpn_box_reg: 0.0931 (0.1094)  time: 2.4820  data: 0.0735  max mem: 11809
Epoch: [5] Total time: 0:01:36 (2.5271 s / it)
creating index...
index created!
Test:  [ 0/38]  eta: 0:00:44  model_time: 1.0984 (1.0984)  evaluator_time: 0.0075 (0.0075)  time: 1.1811  data: 0.0750  max mem: 11809
Test:  [37/38]  eta: 0:00:01  model_time: 1.1083 (1.0080)  evaluator_time: 0.0100 (0.0102)  time: 1.0707  data: 0.0746  max mem: 11809
Test: Total time: 0:00:41 (1.0937 s / it)
Averaged stats: model_time: 1.1083 (1.0080)  evaluator_time: 0.0100 (0.0102)
Accumulating evaluation results...
DONE (t=0.04s).
IoU metric: bbox
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.372
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.884
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.244
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.261
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.464
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.620
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.204
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.447
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.477
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.424
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.532
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.653
creating index...
index created!
Test:  [ 0/27]  eta: 0:00:07  model_time: 0.2534 (0.2534)  evaluator_time: 0.0014 (0.0014)  time: 0.2656  data: 0.0106  max mem: 11809
Test:  [26/27]  eta: 0:00:00  model_time: 0.1798 (0.1822)  evaluator_time: 0.0022 (0.0031)  time: 0.2004  data: 0.0197  max mem: 11809
Test: Total time: 0:00:05 (0.2045 s / it)
Averaged stats: model_time: 0.1798 (0.1822)  evaluator_time: 0.0022 (0.0031)
Accumulating evaluation results...
DONE (t=0.01s).
IoU metric: bbox
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.264
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.767
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.131
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.176
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.381
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.467
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.149
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.353
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.379
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.332
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.440
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.467
Epoch: [6]  [ 0/38]  eta: 0:01:37  lr: 0.000035  loss: 0.5940 (0.5940)  loss_classifier: 0.1678 (0.1678)  loss_box_reg: 0.3270 (0.3270)  loss_objectness: 0.0390 (0.0390)  loss_rpn_box_reg: 0.0603 (0.0603)  time: 2.5589  data: 0.0939  max mem: 11809
Epoch: [6]  [10/38]  eta: 0:01:13  lr: 0.000035  loss: 0.5940 (0.5873)  loss_classifier: 0.1409 (0.1337)  loss_box_reg: 0.2741 (0.3102)  loss_objectness: 0.0390 (0.0540)  loss_rpn_box_reg: 0.0818 (0.0893)  time: 2.6191  data: 0.0737  max mem: 11809
Epoch: [6]  [20/38]  eta: 0:00:45  lr: 0.000035  loss: 0.5376 (0.5845)  loss_classifier: 0.1228 (0.1350)  loss_box_reg: 0.2782 (0.3198)  loss_objectness: 0.0332 (0.0481)  loss_rpn_box_reg: 0.0818 (0.0816)  time: 2.5340  data: 0.0726  max mem: 11809
Epoch: [6]  [30/38]  eta: 0:00:20  lr: 0.000035  loss: 0.5533 (0.5939)  loss_classifier: 0.1228 (0.1398)  loss_box_reg: 0.3310 (0.3274)  loss_objectness: 0.0360 (0.0457)  loss_rpn_box_reg: 0.0713 (0.0809)  time: 2.4926  data: 0.0761  max mem: 11809
Epoch: [6]  [37/38]  eta: 0:00:02  lr: 0.000035  loss: 0.6148 (0.6049)  loss_classifier: 0.1328 (0.1396)  loss_box_reg: 0.3310 (0.3294)  loss_objectness: 0.0430 (0.0490)  loss_rpn_box_reg: 0.0820 (0.0870)  time: 2.3783  data: 0.0771  max mem: 11809
Epoch: [6] Total time: 0:01:33 (2.4600 s / it)
creating index...
index created!
Test:  [ 0/38]  eta: 0:00:34  model_time: 0.8299 (0.8299)  evaluator_time: 0.0076 (0.0076)  time: 0.9133  data: 0.0756  max mem: 11809
Test:  [37/38]  eta: 0:00:01  model_time: 1.1000 (0.9976)  evaluator_time: 0.0071 (0.0073)  time: 1.0834  data: 0.0742  max mem: 11809
Test: Total time: 0:00:41 (1.0795 s / it)
Averaged stats: model_time: 1.1000 (0.9976)  evaluator_time: 0.0071 (0.0073)
Accumulating evaluation results...
DONE (t=0.04s).
IoU metric: bbox
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.484
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.935
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.443
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.321
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.628
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.727
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.239
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.529
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.542
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.432
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.675
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.763
creating index...
index created!
Test:  [ 0/27]  eta: 0:00:07  model_time: 0.2619 (0.2619)  evaluator_time: 0.0012 (0.0012)  time: 0.2734  data: 0.0101  max mem: 11809
Test:  [26/27]  eta: 0:00:00  model_time: 0.1832 (0.1832)  evaluator_time: 0.0015 (0.0024)  time: 0.2006  data: 0.0199  max mem: 11809
Test: Total time: 0:00:05 (0.2051 s / it)
Averaged stats: model_time: 0.1832 (0.1832)  evaluator_time: 0.0015 (0.0024)
Accumulating evaluation results...
DONE (t=0.01s).
IoU metric: bbox
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.310
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.818
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.213
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.192
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.430
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.557
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.164
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.384
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.400
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.338
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.472
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.617
Epoch: [7]  [ 0/38]  eta: 0:01:33  lr: 0.000021  loss: 0.5481 (0.5481)  loss_classifier: 0.1172 (0.1172)  loss_box_reg: 0.3039 (0.3039)  loss_objectness: 0.0400 (0.0400)  loss_rpn_box_reg: 0.0870 (0.0870)  time: 2.4502  data: 0.0794  max mem: 11809
Epoch: [7]  [10/38]  eta: 0:01:12  lr: 0.000021  loss: 0.5544 (0.5401)  loss_classifier: 0.1203 (0.1236)  loss_box_reg: 0.3123 (0.2868)  loss_objectness: 0.0400 (0.0423)  loss_rpn_box_reg: 0.0798 (0.0874)  time: 2.5866  data: 0.0732  max mem: 11809
Epoch: [7]  [20/38]  eta: 0:00:46  lr: 0.000021  loss: 0.5271 (0.5330)  loss_classifier: 0.1203 (0.1246)  loss_box_reg: 0.2799 (0.2841)  loss_objectness: 0.0325 (0.0403)  loss_rpn_box_reg: 0.0776 (0.0841)  time: 2.5628  data: 0.0733  max mem: 11809
Epoch: [7]  [30/38]  eta: 0:00:20  lr: 0.000021  loss: 0.5073 (0.5088)  loss_classifier: 0.1161 (0.1216)  loss_box_reg: 0.2595 (0.2736)  loss_objectness: 0.0269 (0.0372)  loss_rpn_box_reg: 0.0676 (0.0765)  time: 2.5655  data: 0.0776  max mem: 11809
Epoch: [7]  [37/38]  eta: 0:00:02  lr: 0.000021  loss: 0.5151 (0.5203)  loss_classifier: 0.1264 (0.1262)  loss_box_reg: 0.2737 (0.2801)  loss_objectness: 0.0319 (0.0394)  loss_rpn_box_reg: 0.0586 (0.0745)  time: 2.5508  data: 0.0769  max mem: 11809
Epoch: [7] Total time: 0:01:36 (2.5359 s / it)
creating index...
index created!
Test:  [ 0/38]  eta: 0:00:41  model_time: 1.0089 (1.0089)  evaluator_time: 0.0103 (0.0103)  time: 1.0798  data: 0.0605  max mem: 11809
Test:  [37/38]  eta: 0:00:01  model_time: 0.9865 (0.9674)  evaluator_time: 0.0056 (0.0069)  time: 1.0244  data: 0.0729  max mem: 11809
Test: Total time: 0:00:39 (1.0474 s / it)
Averaged stats: model_time: 0.9865 (0.9674)  evaluator_time: 0.0056 (0.0069)
Accumulating evaluation results...
DONE (t=0.03s).
IoU metric: bbox
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.541
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.945
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.570
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.441
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.639
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.704
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.253
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.592
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.603
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.531
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.691
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.732
creating index...
index created!
Test:  [ 0/27]  eta: 0:00:06  model_time: 0.2201 (0.2201)  evaluator_time: 0.0014 (0.0014)  time: 0.2319  data: 0.0103  max mem: 11809
Test:  [26/27]  eta: 0:00:00  model_time: 0.1803 (0.1808)  evaluator_time: 0.0017 (0.0024)  time: 0.1986  data: 0.0193  max mem: 11809
Test: Total time: 0:00:05 (0.2022 s / it)
Averaged stats: model_time: 0.1803 (0.1808)  evaluator_time: 0.0017 (0.0024)
Accumulating evaluation results...
DONE (t=0.01s).
IoU metric: bbox
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.325
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.837
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.120
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.244
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.457
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.376
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.164
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.396
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.400
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.357
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.480
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.383
Epoch: [8]  [ 0/38]  eta: 0:01:40  lr: 0.000010  loss: 0.3418 (0.3418)  loss_classifier: 0.0692 (0.0692)  loss_box_reg: 0.1284 (0.1284)  loss_objectness: 0.0284 (0.0284)  loss_rpn_box_reg: 0.1158 (0.1158)  time: 2.6375  data: 0.0760  max mem: 11809
Epoch: [8]  [10/38]  eta: 0:01:12  lr: 0.000010  loss: 0.4720 (0.4456)  loss_classifier: 0.1013 (0.1014)  loss_box_reg: 0.2065 (0.2294)  loss_objectness: 0.0440 (0.0413)  loss_rpn_box_reg: 0.0526 (0.0736)  time: 2.5751  data: 0.0727  max mem: 11809
Epoch: [8]  [20/38]  eta: 0:00:46  lr: 0.000010  loss: 0.4215 (0.4196)  loss_classifier: 0.0977 (0.1008)  loss_box_reg: 0.2152 (0.2265)  loss_objectness: 0.0254 (0.0328)  loss_rpn_box_reg: 0.0410 (0.0595)  time: 2.5553  data: 0.0710  max mem: 11809
Epoch: [8]  [30/38]  eta: 0:00:20  lr: 0.000010  loss: 0.4215 (0.4339)  loss_classifier: 0.0954 (0.1054)  loss_box_reg: 0.2279 (0.2376)  loss_objectness: 0.0238 (0.0328)  loss_rpn_box_reg: 0.0400 (0.0581)  time: 2.5267  data: 0.0708  max mem: 11810
Epoch: [8]  [37/38]  eta: 0:00:02  lr: 0.000010  loss: 0.4313 (0.4400)  loss_classifier: 0.1010 (0.1080)  loss_box_reg: 0.2333 (0.2402)  loss_objectness: 0.0288 (0.0334)  loss_rpn_box_reg: 0.0400 (0.0584)  time: 2.4018  data: 0.0706  max mem: 11810
Epoch: [8] Total time: 0:01:34 (2.4842 s / it)
creating index...
index created!
Test:  [ 0/38]  eta: 0:00:50  model_time: 1.2471 (1.2471)  evaluator_time: 0.0053 (0.0053)  time: 1.3245  data: 0.0719  max mem: 11810
Test:  [37/38]  eta: 0:00:01  model_time: 1.1016 (0.9941)  evaluator_time: 0.0045 (0.0053)  time: 1.0947  data: 0.0652  max mem: 11810
Test: Total time: 0:00:40 (1.0682 s / it)
Averaged stats: model_time: 1.1016 (0.9941)  evaluator_time: 0.0045 (0.0053)
Accumulating evaluation results...
DONE (t=0.02s).
IoU metric: bbox
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.630
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.960
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.674
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.498
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.761
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.806
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.290
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.662
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.672
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.568
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.804
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.832
creating index...
index created!
Test:  [ 0/27]  eta: 0:00:06  model_time: 0.2289 (0.2289)  evaluator_time: 0.0011 (0.0011)  time: 0.2394  data: 0.0093  max mem: 11810
Test:  [26/27]  eta: 0:00:00  model_time: 0.1803 (0.1817)  evaluator_time: 0.0013 (0.0018)  time: 0.1984  data: 0.0187  max mem: 11810
Test: Total time: 0:00:05 (0.2017 s / it)
Averaged stats: model_time: 0.1803 (0.1817)  evaluator_time: 0.0013 (0.0018)
Accumulating evaluation results...
DONE (t=0.01s).
IoU metric: bbox
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.358
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.852
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.181
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.256
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.477
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.516
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.192
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.421
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.425
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.362
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.504
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.567
Epoch: [9]  [ 0/38]  eta: 0:02:02  lr: 0.000003  loss: 0.4375 (0.4375)  loss_classifier: 0.1176 (0.1176)  loss_box_reg: 0.2257 (0.2257)  loss_objectness: 0.0303 (0.0303)  loss_rpn_box_reg: 0.0640 (0.0640)  time: 3.2333  data: 0.0718  max mem: 11810
Epoch: [9]  [10/38]  eta: 0:01:10  lr: 0.000003  loss: 0.4375 (0.4118)  loss_classifier: 0.1176 (0.1040)  loss_box_reg: 0.2257 (0.2224)  loss_objectness: 0.0299 (0.0335)  loss_rpn_box_reg: 0.0370 (0.0519)  time: 2.5220  data: 0.0709  max mem: 11810
Epoch: [9]  [20/38]  eta: 0:00:45  lr: 0.000003  loss: 0.3503 (0.3862)  loss_classifier: 0.0887 (0.0981)  loss_box_reg: 0.1856 (0.2091)  loss_objectness: 0.0238 (0.0295)  loss_rpn_box_reg: 0.0399 (0.0495)  time: 2.4813  data: 0.0697  max mem: 11810
Epoch: [9]  [30/38]  eta: 0:00:20  lr: 0.000003  loss: 0.3618 (0.3908)  loss_classifier: 0.0894 (0.0992)  loss_box_reg: 0.1929 (0.2089)  loss_objectness: 0.0216 (0.0320)  loss_rpn_box_reg: 0.0459 (0.0506)  time: 2.5028  data: 0.0669  max mem: 11810
Epoch: [9]  [37/38]  eta: 0:00:02  lr: 0.000003  loss: 0.3403 (0.3754)  loss_classifier: 0.0894 (0.0957)  loss_box_reg: 0.2003 (0.2014)  loss_objectness: 0.0202 (0.0304)  loss_rpn_box_reg: 0.0344 (0.0479)  time: 2.4581  data: 0.0650  max mem: 11810
Epoch: [9] Total time: 0:01:34 (2.4839 s / it)
creating index...
index created!
Test:  [ 0/38]  eta: 0:00:48  model_time: 1.2255 (1.2255)  evaluator_time: 0.0024 (0.0024)  time: 1.2869  data: 0.0588  max mem: 11810
Test:  [37/38]  eta: 0:00:01  model_time: 0.8375 (0.9799)  evaluator_time: 0.0046 (0.0048)  time: 1.0073  data: 0.0699  max mem: 11810
Test: Total time: 0:00:40 (1.0526 s / it)
Averaged stats: model_time: 0.8375 (0.9799)  evaluator_time: 0.0046 (0.0048)
Accumulating evaluation results...
DONE (t=0.03s).
IoU metric: bbox
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.639
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.960
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.695
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.502
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.772
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.839
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.297
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.675
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.685
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.581
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.815
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.858
creating index...
index created!
Test:  [ 0/27]  eta: 0:00:06  model_time: 0.2359 (0.2359)  evaluator_time: 0.0010 (0.0010)  time: 0.2464  data: 0.0094  max mem: 11810
Test:  [26/27]  eta: 0:00:00  model_time: 0.1793 (0.1815)  evaluator_time: 0.0012 (0.0017)  time: 0.1971  data: 0.0180  max mem: 11810
Test: Total time: 0:00:05 (0.2008 s / it)
Averaged stats: model_time: 0.1793 (0.1815)  evaluator_time: 0.0012 (0.0017)
Accumulating evaluation results...
DONE (t=0.01s).
IoU metric: bbox
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.380
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.848
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.285
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.271
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.508
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.531
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.195
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.442
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.447
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.383
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.528
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.583
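In the logs above, AP@[IoU=0.50:0.95] is the arithmetic mean of AP computed at ten IoU thresholds (0.50, 0.55, ..., 0.95), per the COCO protocol. A minimal sketch of that averaging, using hypothetical per-threshold AP values (not taken from the run above):

```python
# Hypothetical per-threshold AP values, one per IoU threshold
# 0.50, 0.55, ..., 0.95 -- NOT values from the training run above
iou_thresholds = [0.50 + 0.05 * i for i in range(10)]
ap_per_threshold = [0.96, 0.92, 0.88, 0.82, 0.74, 0.66, 0.55, 0.40, 0.22, 0.05]

# AP@[0.50:0.95] is simply the mean over the ten thresholds
mean_ap = sum(ap_per_threshold) / len(ap_per_threshold)
print(f'AP@[0.50:0.95] = {mean_ap:.3f}')  # 0.620
```

This is why AP@0.50 in the logs is much higher than AP@[0.50:0.95]: the strict thresholds (0.85-0.95) pull the mean down.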

In [51]:
# Load the saved [results, state_dict] pair; index 1 holds the weights
a = torch.load('/content/gdrive/My Drive/Colab Notebooks/resnet101v2.pt')
model = get_model('resnet101')
model.load_state_dict(a[1])
model.to(device)
test_predictions(model, test_data_loader, 55)
In [ ]:
# Validation AP before vs. after ('validition' is the key spelling
# used when the results were saved)
plt.plot(compare_result['validition']['average precision'], label='after')
plt.plot(dirty_result['validition']['average precision'], label='before')

plt.legend()
plt.show()
In [11]:
import torchvision
from torchvision.models import *

models_list = [alexnet, 
               densenet121, googlenet, inception_v3, 
               mobilenet_v2, resnet101, resnet152, 
               resnet50, squeezenet1_0]

models_name = ['alexnet', 'densenet121', 'googlenet', 
               'inception_v3', 'mobilenet_v2', 'resnet101', 
               'resnet152', 'resnet50', 'squeezenet1_0']

# Count parameters of each pretrained model and track the largest
find_max = 0
max_model_name = None
for model, name in zip(models_list, models_name):
    t = model(pretrained=True)
    pytorch_total_params = sum(p.numel() for p in t.parameters())
    print(f'{name}: {pytorch_total_params}')
    if pytorch_total_params > find_max:
        find_max = pytorch_total_params
        max_model_name = name
print(f'Max:\n{max_model_name}: {find_max}')
alexnet: 61100840
densenet121: 7978856
googlenet: 6624904
inception_v3: 27161264
mobilenet_v2: 3504872
resnet101: 44549160
resnet152: 60192808
resnet50: 25557032
squeezenet1_0: 1248424
Max:
alexnet: 61100840
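The counts above sum every parameter tensor, trainable or not. A minimal sketch of the same `numel` sum on a tiny `nn.Linear` (not one of the models above), split into total vs. trainable parameters:

```python
import torch.nn as nn

# A tiny layer: weight (5x10) + bias (5) = 55 parameters
layer = nn.Linear(10, 5)

total = sum(p.numel() for p in layer.parameters())
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(total, trainable)  # 55 55 while nothing is frozen

# Freezing the weight removes it from the trainable count only
layer.weight.requires_grad_(False)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(total, trainable)  # 55 5
```

For the full detection models, filtering on `requires_grad` matters once backbone layers are frozen for fine-tuning.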
In [17]:
num_classes = 2  # background + face mask
anchor_generator = AnchorGenerator(
                        sizes=tuple([(32, 64, 128, 256) for _ in range(5)]),
                        aspect_ratios=tuple([(0.25, 0.5, 1.0, 2.0) for _ in range(5)]))
# AlexNet's convolutional trunk as the backbone; its last conv layer emits
# 256 channels, which FasterRCNN reads from the `out_channels` attribute
backbone = torchvision.models.alexnet(pretrained=True).features
backbone.out_channels = 256
model = FasterRCNN(backbone, num_classes,
                    rpn_anchor_generator=anchor_generator,
                    rpn_head=RPNHead(backbone.out_channels, anchor_generator.num_anchors_per_location()[0]))
# replace the pre-trained box predictor head with a new one
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes=num_classes)
Downloading: "https://download.pytorch.org/models/alexnet-owt-4df8aa71.pth" to /root/.cache/torch/hub/checkpoints/alexnet-owt-4df8aa71.pth

In [13]:
CHECKPOINT_DIR_PATH = '/content/gdrive/My Drive/Colab Notebooks/alexnet.pt'
In [16]:
# move model to the right device
model.to(device)

# let's train it for 10 epochs
num_epochs = 10

# Optimizer: AdaMod with weight decay
optimizer = adamod.AdaMod(model.parameters(), lr=0.0001, weight_decay=0.0001, beta3=0.999)

# Learning rate scheduler: cosine annealing over the training run
lr_scheduler = LegacyCosineAnnealingLR(optimizer, T_max=num_epochs)
lr_scheduler.eta_min = 1e-7

compare_result = run()
torch.save([compare_result, model.state_dict()], CHECKPOINT_DIR_PATH)
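The cosine schedule above decays the learning rate from its initial value toward `eta_min` over `T_max` epochs. A minimal sketch using the standard `torch.optim` scheduler (the notebook's `LegacyCosineAnnealingLR` is assumed to behave the same way):

```python
import torch

# Dummy optimizer over a single parameter, just to drive the schedule
opt = torch.optim.SGD([torch.nn.Parameter(torch.zeros(1))], lr=1e-4)
sched = torch.optim.lr_scheduler.CosineAnnealingLR(opt, T_max=10, eta_min=1e-7)

lrs = []
for _ in range(10):
    lrs.append(opt.param_groups[0]['lr'])
    opt.step()
    sched.step()
print(lrs[0], lrs[-1])  # starts at 1e-4, decays toward eta_min
```

This matches the `lr:` column in the epoch logs above, which shrinks from ~1e-4 early on to 0.000003 by epoch 9.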
In [45]:
# Load the saved [results, state_dict] pairs; use new names to avoid
# shadowing the torchvision constructors imported above
resnet_ckpt = torch.load('/content/gdrive/My Drive/Colab Notebooks/resnet101v2.pt')
alexnet_ckpt = torch.load('/content/gdrive/My Drive/Colab Notebooks/alexnet.pt')
plt.title('average precision', fontsize=20)
plt.plot(resnet_ckpt[0]['validition']['average precision'], label='resnet 101')
plt.plot(alexnet_ckpt[0]['validition']['average precision'], label='alexnet')

plt.legend()
plt.show()
In [48]:
test_predictions(model, test_data_loader, 50)

As the graph and the predictions show, choosing an architecture with more parameters does not improve the model; in fact, it hurts performance and lowers the average precision. The larger model overfits: it fails to detect masks even when they appear close and clear in the image, while wrongly detecting other objects as face masks. In conclusion, of the three models trained, the largest one achieved the worst result.
